Abstract 

Over the last decades, researchers have characterized a set of "clock genes" that drive daily rhythms in physiology and 
behavior. This arduous work has yielded results with far-reaching consequences in metabolic, psychiatric, and neoplastic 
disorders. Recent attempts to expand our understanding of circadian regulation have moved beyond the mutagenesis 
screens that identified the first clock components, employing higher throughput genomic and proteomic techniques. In 
order to further accelerate clock gene discovery, we utilized a computer-assisted approach to identify and prioritize 
candidate clock components. We used a simple form of probabilistic machine learning to integrate biologically relevant, 
genome-scale data and ranked genes on their similarity to known clock components. We then used a secondary experi- 
mental screen to characterize the top candidates. We found that several physically interact with known clock components in 
a mammalian two-hybrid screen and modulate in vitro cellular rhythms in an immortalized mouse fibroblast line (NIH 3T3). 
One candidate, Gene Model 129, interacts with BMAL1 and functionally represses the key driver of molecular rhythms, the 
BMAL1/CLOCK transcriptional complex. Given these results, we have renamed the gene CHRONO (computationally 
highlighted repressor of the network oscillator). Bi-molecular fluorescence complementation and co-immunoprecipitation 
demonstrate that CHRONO represses by abrogating the binding of BMAL1 to its transcriptional co-activator CBP. Most 
importantly, CHRONO knockout mice display a prolonged free-running circadian period similar to, or more drastic than, six 
other clock components. We conclude that CHRONO is a functional clock component providing a new layer of control on 
circadian molecular dynamics. 
Introduction 

Circadian rhythms are ubiquitous in daily life, coordinating the 
sleep-wake cycle along with oscillations in hormone secretion, 
blood pressure, and cognitive function [1,2]. While a central 
master-pacemaker is located in the suprachiasmatic nuclei (SCN) 
of the hypothalamus, cell autonomous rhythms are generated 
throughout the body. The CLOCK/BMAL1 transcriptional 
complex lies at the core of the molecular clock. These proteins 
bind E-box elements in the promoters of target genes [3]. The 
Period and Cryptochrome gene families are prominent among 
these targets, and their products ultimately repress CLOCK/BMAL1 
activity and their own transcription [4,5]. A second loop 
regulates Bmal1 expression through the opposing actions of the 
REV-ERB and ROR nuclear receptor protein families [6,7]. 
Circadian oscillations are in turn subject to multiple layers of 
control. The casein kinase I proteins (CSNK1D and CSNK1E) 
and the F-box and leucine-rich repeat proteins (FBXL3, FBXL21) 
[8-10] regulate the nuclear accumulation and/or stability of clock 
components, respectively. Moreover, recent evidence highlights 
Author Summary 

Daily rhythms are ever-present in the living world, driving 
the sleep-wake cycle and many other physiological changes. 
In the last two decades, several labs have identified "clock 
genes" that interact to generate underlying molecular 
oscillations. However, many aspects of circadian molecular 
physiology remain unexplained. Here, we used a simple 
"machine learning" approach to identify new clock genes 
by searching the genome for candidate genes that share 
clock-like features such as cycling, broad-based tissue RNA 
expression, in vitro circadian activity, genetic interactions, 
and homology across species. Genes were ranked by their 
similarity to known clock components and the candidates 
were screened and validated for evidence of clock function 
in vitro. One candidate, which we renamed CHRONO 
(Gm129), interacted with the master regulator of the clock, 
BMAL1, disrupting its transcriptional activity. We found that 
Chrono knockout mice had prolonged locomotor activity 
rhythms, getting up progressively later each day. Our 
experiments demonstrated that CHRONO interferes with 
the ability of BMAL1 to recruit CBP, a bona fide histone 
acetylase and key transcriptional coactivator of the circadian 
clock. 



the importance of metabolic cofactors and histone modifiers (e.g., 
HDAC3, P300, CBP, SIRT1, and NAMPT) in modulating these 
feedback loops. 

The understanding of circadian timekeeping has demonstrated 
far-reaching importance. Allelic variation in clock components has 
been associated with circadian, sleep, and mood disorders [8,11-13]. 
Mutational and epidemiologic studies have linked clock genes 
with neoplastic and metabolic phenotypes [2,14]. However, the 
current model of the circadian pacemaker is likely incomplete. 
Indeed, quantitative circadian trait analysis maps most loci to 
regions unassociated with known clock genes [15]. In an attempt 
to identify these missing regulatory components, researchers have 
moved beyond the costly and laborious mutagenesis screens that 
identified the first clock components [16,17]. Recent studies have 
turned to higher throughput genomic and proteomic approaches. 
A screen for activators of BMAL1 transcription [7], a screen for 
proteins that bind CLOCK [18], and proteomic analysis of the 
BMAL1 [19] and PERIOD [20,21] protein complexes have all 
identified proteins that function in circadian control. 

Here we present an alternative, computer-assisted approach 
aimed at accelerating clock gene discovery. We used probabilistic 
machine learning to integrate heterogeneous, genome-scale data- 
sets [22-24] and identify candidate clock genes that functionally 
resemble known clock components. We screened the top candi- 
dates for physical interactions with a subset of clock components 
using a mammalian two-hybrid assay. Candidates were further 
screened for circadian function in an in vitro system. We focused 
our attention on three promising initial candidates. Here we 
demonstrate the utility of this approach with data from the first of 
these candidates, Gene Model 129 (Gm129), to have its circadian 
function characterized in both cells and knockout mice. We 
confirmed that Gm129 physically interacts with core clock genes 
and regulates the molecular oscillator. In addition, Gm129 
oscillates in multiple tissues, functionally represses the activity of 
the CLOCK/BMAL1 transcriptional complex, and most importantly, 
influences the free-running circadian period of locomotor 
activity in mice. In view of its role as a computationally highlighted 
repressor of the network oscillator, we have renamed the gene 
Chrono. 



Results and Discussion 

In order to identify novel "core clock genes," we considered 
physiologically relevant features that define core circadian com- 
ponents: (1) Core clock components cycle with a ~24-h period. (2) 
Core clock gene mutation or knockdown affects circadian behav- 
ioral rhythms. (3) Core clock genes interact with other core clock 
genes. (4) Core clock genes are expressed in most tissues. (5) Core 
clock genes are phylogenically conserved between vertebrates and 
flies. 

Importantly, as is demonstrated by our exemplar set of known 
clock genes (Figure 1A), none of these features are absolute requirements: 
The canonical circadian gene Clock, for example, does not 
cycle robustly in the pituitary [25] or SCN [26]. Individual knockdown 
of either the Nr1d1 or Nr1d2 genes has minimal phenotypic 
effect [27]. Rather, these features lie on a continuum, each lending 
some support to a given gene having a core circadian function. 

We used published, genome-wide datasets that provide infor- 
mation on each of these features and developed simple, albeit 
imperfect, metrics to quantify each feature. These metrics were 
designed to reward clock-like features. 

Core Clock Metrics 

Cycling. In order to assess transcript cycling, we reanalyzed 
high-resolution time course microarray data for liver, pituitary, 
and NIH 3T3 cells [25]. As detailed in the Materials and Methods 
section, we combined the p values obtained by evaluating cycling 
in each tissue to create a single general cycling metric (Mcyc) for 
each gene. Higher values of Mcyc correspond to more robust 
cycling in this combination of tissues. Compared to nonclock genes 
(Figure 1B, Left), the distribution of Mcyc among the exemplar 
clock genes (Figure 1B, Center) is shifted far to the right with clock 
components demonstrating more robust cycling. Intuitively, a very 
high value of Mcyc provides some suggestion that a gene may 
belong to the set of core clock genes. 
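
The Materials and Methods section that defines Mcyc is not reproduced here, so the following is only a sketch of one common way to pool per-tissue p values: Fisher's method, which sums the negative log p values. The function name `cycling_metric` and the example p values are hypothetical, and the authors' actual combination rule may differ.

```python
import math

def cycling_metric(p_values):
    """Pool per-tissue cycling p values with Fisher's statistic.

    Assumption: Fisher's method, -2 * sum(ln p). The paper's own rule is
    defined in its Materials and Methods and may differ.
    Larger return values mean stronger combined evidence of cycling.
    """
    if not p_values:
        raise ValueError("need at least one p value")
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("p values must lie in (0, 1]")
    return -2.0 * sum(math.log(p) for p in p_values)

# Hypothetical p values for liver, pituitary, and NIH 3T3 cells.
strong_cycler = cycling_metric([1e-6, 1e-4, 1e-3])
weak_cycler = cycling_metric([0.4, 0.7, 0.9])
```

Under this assumed rule, a gene with small p values in all three tissues receives a much larger score than one with uniformly weak evidence, which matches the stated convention that higher Mcyc values reflect more robust cycling.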

Phenotype. We used data from a genome-wide RNA interference 
(RNAi) screen identifying in vitro circadian modulators [28] 
to generate a circadian disturbance metric (Mdist). By construction, 
larger values of Mdist reflect greater influence on in vitro 
rhythms (Figure 1C). In comparison to nonclock genes, the 
distribution of Mdist among core-clock genes is shifted to the right 
with clock genes demonstrating more impact on cellular circadian 
phenotypes. Interestingly, the most extreme values in Mdist did not 
result from the knockdown of known clock genes. A second, small 
mode of extreme Mdist values was observed in the screen and may 
have resulted from knockdowns that nonspecifically affected 
cellular health [28]. 

Network interactions. A genome-wide database of func- 
tional genetic interactions inferred from radiation hybrid mapping 
was used to count the number of connections between each gene 
and the exemplar set of clock components (Mint) [29]. The distributions 
of Mint within the genome at large and among the exemplar 
set of core clock components are shown in Figure 1D. Clock genes 
form a tightly connected network with core clock genes being more 
likely to have functional connections to other clock genes. 
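
As a rough illustration of the counting step behind Mint, the sketch below tallies how many distinct exemplar clock genes are linked to a given gene in an undirected edge list. The function name `interaction_metric`, the edge-list representation, and the toy gene pairs are assumptions made for the example; they are not drawn from the radiation hybrid database itself.

```python
def interaction_metric(gene, interactions, exemplars):
    """Count distinct exemplar clock genes linked to `gene`.

    `interactions` is an iterable of (gene_a, gene_b) pairs, treated as
    undirected. A gene is never counted as its own partner.
    """
    partners = set()
    for a, b in interactions:
        if a == gene and b != gene:
            partners.add(b)
        elif b == gene and a != gene:
            partners.add(a)
    return len(partners & set(exemplars))

# Toy, made-up edge list; not taken from the published database.
toy_edges = [("GeneX", "Per2"), ("Bmal1", "GeneX"), ("GeneX", "GeneY"),
             ("Per2", "GeneX"), ("GeneY", "Cry1")]
toy_exemplars = {"Bmal1", "Clock", "Per2", "Cry1"}
m_int_x = interaction_metric("GeneX", toy_edges, toy_exemplars)
m_int_y = interaction_metric("GeneY", toy_edges, toy_exemplars)
```

Using a set of partners means that a duplicated or reversed edge is counted only once, which seems the safer choice when an inferred interaction may be listed in both directions.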

Ubiquity. We counted the number of tissues in which each 
gene has been definitively identified via Expressed Sequence Tags 
(ESTs) [30]. The plurality of nonclock transcripts are detected in 
only 1-2 tissues, and fewer than half of all transcripts have been 
found in 15 or more murine tissues. In comparison, the exemplar 
core clock genes are more widely expressed (Figure 1E). 

Phylogenic conservation. For each included gene, we utilized 
the Homologene database [31] to determine if an annotated 
Drosophila melanogaster homologue has been identified. While this 
Figure 1. Integration of core clock features. (A) List of exemplar core clock genes used as example models of core clock components. (B-E) Metric 
functions describing core clock features were generated from published data. Distributions of these metrics among nonclock genes (left panel) and 
exemplar clock genes (center panel) were used to construct evidence factors (right panel). (B) Cycling was evaluated using time-course microarray data 
from liver, pituitary, and NIH 3T3 cells. (C) Circadian disturbance metric quantifies the influence of RNAi-mediated gene knockdown on circadian 
dynamics in the U2OS model system. (D) The interaction metric counts the number of interactions inferred between each gene and the exemplar set of 
core clock genes. (E) The tissue ubiquity scores were taken from an EST database. (F) List of 20 genes most likely to have a core circadian function as 
determined by evidence factor integration. Genes highlighted in blue were included in the exemplar training set. Genes highlighted in purple were not 
in the training set but have been identified as having a role in the circadian clock. Gm129 was selected for further characterization. 
doi:10.1371/journal.pbio.1001840.g001 



feature was included in the final model, there was only a small 
difference between the fraction of clock genes possessing Drosophila 
homologues and the fraction of nonclock genes possessing such 
homologues. The modest fraction of exemplar genes with anno- 
tated homologues and this small difference likely reflects the strict 
criteria used in constructing the Homologene database and may 
underestimate the value of this feature in the ultimate weighting. 

Creation of Circadian Evidence Factors 

Using the above empirical distributions and a modified version 
of the Naive Bayes learning algorithm, we quantified the evidence 
provided by each feature that a given gene is a member of the 



core circadian network [22,32]. We relied on the prior assump- 
tion, informed by experimenter judgment, that increasing 
possession of each of these features lends increasing evidence of 
a role in the circadian clock. We used the empirical cumulative 
distribution function (ECDF) describing exemplar "clock genes" 
or "nonclock genes" to estimate the probabilities that a randomly 
selected "clock gene" or "nonclock gene" would possess a metric 
value at least as extreme as the one observed. We term the ratio 
of these probabilities a "circadian evidence factor" (Materials 
and Methods, Eq. 4). The evidence factors arising from particular 
features and metric values are shown in the right panels of 
Figure 1B-E. 
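
The ratio described above can be sketched in a few lines. Eq. 4 is not reproduced in this section, so the code below is an assumption-laden illustration: it estimates each tail probability directly from the sample and adds a pseudocount so that the ratio stays finite when no nonclock gene reaches the observed value. The names `tail_probability` and `evidence_factor`, the pseudocount, and the metric values are all hypothetical.

```python
def tail_probability(values, x, pseudocount=1.0):
    """Estimate P(metric >= x) from an empirical sample.

    The pseudocount keeps the estimate above 0 so the ratio below stays
    finite; it is an illustrative assumption, not the paper's Eq. 4.
    """
    at_least = sum(1 for v in values if v >= x)
    return (at_least + pseudocount) / (len(values) + pseudocount)

def evidence_factor(x, clock_values, nonclock_values):
    """Ratio of tail probabilities: clock genes over nonclock genes."""
    return tail_probability(clock_values, x) / tail_probability(nonclock_values, x)

# Made-up metric values, for illustration only.
clock = [8.0, 9.5, 7.2, 10.1, 6.8]
nonclock = [1.0, 2.5, 0.3, 3.1, 1.7, 0.9, 2.2, 8.5, 1.1, 0.6]
high = evidence_factor(9.0, clock, nonclock)
low = evidence_factor(0.5, clock, nonclock)
```

In this toy sample, a metric value that only clock genes tend to reach yields a factor well above 1, while a value that nearly every gene exceeds yields a factor close to 1 and therefore contributes little evidence either way.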
The evidence amassed from all five features is encapsulated by a 
"combined evidence factor." Computation of combined evidence 
factors requires knowledge of the joint cumulative probability 
distributions for these features among both "clock genes" and 
"nonclock genes." These joint cumulative distribution functions 
are "learned" from the examples under the "Naive" assumption of 
conditional independence. Evidence factors from each individual 
feature are multiplied to calculate the combined evidence factor 
(Materials and Methods, Eq. 6). This approach differs from the 
standard Naive Bayes statistical learning approach only in that 
cumulative distribution functions are used rather than probability 
density functions. 
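
A minimal sketch of the multiplication step follows, assuming that per-feature evidence factors have already been computed for each gene. The product is accumulated in log space to avoid overflow or underflow when many factors are combined; that choice, the function names, and the toy factors are illustrative assumptions rather than details taken from Eq. 6.

```python
import math

def combined_evidence_factor(factors):
    """Multiply per-feature evidence factors (summed in log space)."""
    if any(f <= 0 for f in factors):
        raise ValueError("evidence factors must be positive")
    return math.exp(sum(math.log(f) for f in factors))

def rank_genes(per_gene_factors):
    """Return gene names sorted by combined evidence, highest first."""
    scores = {g: combined_evidence_factor(f) for g, f in per_gene_factors.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical per-feature factors (cycling, disturbance, interactions,
# ubiquity, conservation) for three made-up genes.
toy = {
    "GeneA": [5.0, 2.0, 1.5, 1.2, 1.0],
    "GeneB": [1.1, 0.9, 1.0, 1.3, 1.0],
    "GeneC": [9.0, 0.5, 1.0, 1.0, 1.0],
}
ranking = rank_genes(toy)
```

In the toy data, GeneC has the strongest single feature but GeneA ranks first because several moderate factors multiply together, which illustrates how a combination of features can outweigh one dominant feature.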

We ranked genes based on this combined evidence. The top 20 
candidates (Figure 1F) include 10 of the exemplar clock components 
along with Tef [33] and Nfil3 [34], two genes with established 
circadian functions. Moreover, Wee1, a canonical cell cycle gene, is 
known to be regulated by the circadian clock [35], although, to 
our knowledge, the hypothesis that Wee1 directly regulates clock 
function has not been tested. Among the top 50 ranked genes, 
several other genes known to be involved in the circadian clockworks 
appear. These include Dbp [36], Insig2 [37], and Nampt [38]. 

Evidence Factors Predict Circadian Function 

In order to evaluate the utility of this ranking in the discovery 
of novel clock genes, we applied 10-fold cross-validation. We 
sequentially removed all possible pairs of clock components from 
the exemplar distribution, ignoring our prior knowledge of their 
role in orchestrating circadian rhythms. In each case, we then 
recomputed the combined evidence factors based on this reduced 
knowledgebase and tested our ability to "rediscover" these clock 
genes using different ranking cutoffs. Based on this analysis, we 
estimate that ~50% of true clock components would be recovered 
by screening the top 50 genes (Figure S1A). We also compared the 
use of evidence factors with two prepackaged machine learning 
algorithms. Using the same features, we ranked genes using a 
Gaussian Naive Bayes classifier and a Flexible Naive Bayes classifier 
[39]. The three methods all yield comparable performances 
using cutoffs less than ~1,000, but the evidence factor method 
outperforms the other two beyond this point. Importantly, the top 
candidates from all three methods show a very high degree of 
overlap (Figure S1B). 
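
The "rediscovery" test described above can be sketched for a single feature as follows. Every pair of exemplars is held out in turn, the clock-gene distribution is rebuilt from the remaining exemplars, and the held-out genes are ranked alongside the unlabeled background. The smoothed tail ratio used as the score, the treatment of held-out genes as part of the unlabeled pool, and all gene names and values are assumptions made for this illustration; the published analysis used all five features.

```python
from itertools import combinations

def tail_ratio(x, clock_values, nonclock_values):
    """Smoothed ratio of P(clock >= x) to P(nonclock >= x)."""
    def tail(values):
        return (sum(v >= x for v in values) + 1.0) / (len(values) + 1.0)
    return tail(clock_values) / tail(nonclock_values)

def recovery_rate(exemplars, background, cutoff):
    """Hold out every pair of exemplars and count how often a held-out
    gene lands in the top `cutoff` of the re-ranked candidate pool."""
    hits = trials = 0
    for pair in combinations(sorted(exemplars), 2):
        # Rebuild the "clock" distribution without the held-out pair.
        train = [v for g, v in exemplars.items() if g not in pair]
        # Held-out genes are treated as unlabeled candidates.
        pool = dict(background)
        pool.update((g, exemplars[g]) for g in pair)
        unlabeled = list(pool.values())
        ranked = sorted(pool, key=lambda g: tail_ratio(pool[g], train, unlabeled),
                        reverse=True)
        top = set(ranked[:cutoff])
        hits += sum(g in top for g in pair)
        trials += len(pair)
    return hits / trials

# Made-up single-feature metric values.
exemplars = {"E1": 9.0, "E2": 8.0, "E3": 7.5, "E4": 2.0}
background = {"B1": 0.5, "B2": 1.0, "B3": 1.5, "B4": 2.5, "B5": 3.0, "B6": 8.5}
rates = {k: recovery_rate(exemplars, background, k) for k in (1, 3, 8)}
```

Plotting `rates` against the cutoff gives a recovery curve of the kind summarized in the text; by construction it can only rise as the cutoff grows, reaching 1.0 once the cutoff covers the whole candidate pool.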

Only rankings from the evidence factor approach were used in 
selecting genes for further screening. However, results from all 
three probabilistic learning methods are presented in the Sup- 
porting Information section. The cycling feature makes the largest 
single contribution to the combined evidence factors, but it does 
not completely dominate this ranking. Hundreds of genes demon- 
strate strong cycling in the tissues analyzed and other features 
determine the relative ranking among these. Moreover, some 
candidates, like Hdac11, are largely prioritized based on the 
combined strength of other features. 

Given the rarity of bona fide clock genes, any method that is not 
100% specific will result in a number of false positives. As the 
ranking cutoff is increased, the number of nonclock genes incorrectly 
identified will also increase. As in other screening applications 
where one is searching for a "needle in a haystack," a 
secondary validation of candidates is needed. Assuming different 
numbers for the total number of core clock components, we estimated 
the false positive rate for different screening cutoffs (Figure S1C). 
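
The arithmetic behind such an estimate can be sketched as follows, under stated assumptions: the number of still-undiscovered clock genes is a guess supplied by the user, and the recovery fraction at a given cutoff comes from cross-validation (for example, roughly 0.5 at a cutoff of 50, as reported above). The function name `screening_yield` and the trial values of 5, 10, and 20 undiscovered genes are hypothetical, and the fraction returned here (nonclock genes among screened candidates) may not match the exact definition used in Figure S1C.

```python
def screening_yield(cutoff, undiscovered_clock_genes, recovery):
    """Expected composition of the top `cutoff` novel candidates.

    Returns (expected true clock genes, expected nonclock genes,
    fraction of screened candidates that are nonclock genes).
    """
    if cutoff <= 0:
        raise ValueError("cutoff must be positive")
    if not 0.0 <= recovery <= 1.0:
        raise ValueError("recovery must lie in [0, 1]")
    expected_true = min(float(cutoff), recovery * undiscovered_clock_genes)
    expected_false = cutoff - expected_true
    return expected_true, expected_false, expected_false / cutoff

# Hypothetical totals for the number of undiscovered clock genes.
scenarios = {n: screening_yield(50, n, 0.5) for n in (5, 10, 20)}
```

Even under the most optimistic of these hypothetical scenarios, most of the 50 screened candidates would be expected to be nonclock genes, which is why a secondary experimental screen is needed.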

The ultimate value of this approach will be determined by its 
ability to identify previously unrecognized clock components. We 
tested the top 25 novel candidates for physical interactions with a 
subset of proteins from the negative arm of the molecular clock 
(BMAL1, BMAL2, CLOCK, NPAS2, CRY1, CRY2, PER1, 
PER2, and PER3). Three of these candidates (Gm129, Ifitm1, and 
Cbs) demonstrated both physical binding with at least one of the 
included clock components and a statistically significant change in 
circadian reporter period after knockdown in the NIH 3T3 model 
system (Figure S2). Of note, although Gm129 might have been 
identified simply by its strong cycling, Cbs and Ifitm1 are identified 
by virtue of a combination of features. Below we present a more 
detailed investigation of the previously uncharacterized candidate, 
Gm129, here renamed Chrono. These data show that Chrono meets 
the formal definition of a mammalian circadian clock gene. 

Chrono mRNA Cycles in Multiple Tissues 

Our previous microarray data suggested that Chrono expression 
cycles with a 24-h period in liver, pituitary, and NIH 3T3 cells 
[25,40]. We used quantitative PCR (qPCR) to confirm cycling in 
the liver and further evaluated transcript cycling in skeletal muscle 
and white fat (Figure 2). The circadian oscillations in Chrono expression 
are of a similar magnitude to those observed for known clock 
factors Nr1d1 and Per2. Consistent with our results, temporal 
profiling in rat skeletal muscle [41] and lung [42], as well as mouse 
SCN [43], also revealed daily oscillations in Chrono expression. 
Several genome-wide, ChIP-seq studies in mouse liver [43-45] have 
identified the E-boxes in the Chrono gene promoter among those 
genomic regions most tightly bound by BMAL1 protein. Time 
course microarray studies from SCN and liver demonstrate that 
Chrono expression is reduced in Clock mutant animals and loses 
circadian rhythmicity (Figure S3A) [46,47]. Moreover, Chrono 
expression becomes arrhythmic in the livers of Cry1/Cry2 double 
knockout animals (Figure S3B) [48]. In total, Chrono demonstrates 
robust circadian expression in multiple tissues and appears to be 
directly regulated by the molecular clock. 

Chrono Physically and Functionally Interacts with the 
Circadian Clock 

We employed a mammalian two-hybrid screen to identify physical 
interactions between CHRONO and a subset of known clock 
components. As expected, many core clock proteins physically 
interacted, as indicated by specific activation of a UAS:Luc 
reporter in transfected Human Embryonic Kidney 293 cells 
containing the SV40 T-Antigen (HEK 293T) (Figure 3A, Table 
S1). Interactions between CHRONO and both BMAL1 and 
PER2 were also observed, with >20-fold induction of luciferase 
activity. BMAL1-CHRONO and PER2-CHRONO complex 
formation were confirmed through co-immunoprecipitation (co-IP) 
(Figure 3B and C). Bi-molecular Fluorescence Complementation 
(BiFC) using Venus, an enhanced yellow fluorescent protein (YFP), 
was then used to map BMAL1/CHRONO interactions to cell nuclei 
(Figure 3D). Notably, when S-tagged CHRONO was overexpressed 
with both BMAL1 and CLOCK BiFC fusion proteins, CHRONO 
appeared to colocalize with the CLOCK/BMAL1 heterodimer in 
nuclear bodies, suggesting that CHRONO continues to interact 
with BMAL1 while part of this functional circadian complex. 

To evaluate the functional consequences of these physical 
interactions, we monitored Per1:luciferase activity in unsynchronized 
HEK 293T cells transiently transfected with Clock/Bmal1. 
Per1-luc reporter activity is enhanced by Clock/Bmal1 transfection 
but repressed by the overexpression of either Cry1 or Chrono 
(Figure 3E). As has been previously demonstrated, CLOCK H360Y 
and BMAL1 G612E missense mutants are resistant to CRY-mediated 
repression [49]. In contrast, CHRONO-mediated 
repression is unaffected by these point mutations (Figure 3E). The 
same pattern was observed in the expression of Nr1d1, an 
endogenous CLOCK/BMAL1 target (Figure S4). Alternatively, 
CHRONO knockdown augments Per1:luc reporter activity 



PLOS Biology | www.plosbiology.org 






April 2014 | Volume 12 | Issue 4 | e1001840 



Machine Learning Helps Identify A New Clock Component 



[Figure 2 panels: (A) Liver, (B) Skeletal Muscle, (C) White Adipose. Each panel plots Chrono, Per2, and Nr1d1 transcript levels against CT (hrs), 18 to 40.] 

Figure 2. Chrono transcript demonstrates circadian oscillations 
in peripheral tissues. qPCR was used to measure transcript 
abundance of Chrono, Per2, and Nr1d1 in (A) liver, (B) skeletal muscle, 
and (C) adipose tissue. Circadian variation is observed in each tissue 
with the amplitude of Chrono oscillations comparable to that of Per2 
and Nr1d1. Data shown are the average of 3-4 biological replicates. 
doi:10.1371/journal.pbio.1001840.g002 



(Figure 3F). These data suggest that CHRONO and CRY1 have 
distinct binding sites and/or functional mechanisms. 

Endogenous Chrono Expression Modulates in Vivo 
Circadian Oscillations 

Small interfering RNA (siRNA) mediated knockdown of 
C1orf51, the human homologue of Chrono, markedly dampened 
circadian oscillations in a genome-wide circadian screen [28]. Using 
NIH 3T3 cells expressing a Bmal1:dLuc reporter as a second model 
system, we tested the effects of four different short hairpin RNA 
(shRNA) constructs that reduced Chrono transcript expression and 
protein abundance (Figure S5). Comparing the pooled results to 
control demonstrates that Chrono knockdown reduces amplitude and 
increases circadian period (Figure 4A-F). To definitively establish 
the role of CHRONO in modulating circadian behavior, we 
obtained transgenic mice from the Knockout Mouse Project [50]. 
These mice incorporate a transgenic construct (Figure S6A) whereby 
the Chrono encoding region is flanked by Lox-P sites 
and utilizes a "knockout-first" cassette [51]. The transgenic allele is 
a knockout at the level of RNA processing. We mated heterozygous 
transgenic mice to obtain homozygous Chrono knockout mice, 
wild-type littermate controls, and heterozygotes. 
qPCR confirmed that, when compared to wild-type 
littermate controls, mRNA expression was halved in heterozygotes 
and abolished to basal levels in homozygote knockouts 
(Figure S6B and C). As shown in Figure 4G and H, 
wild-type, heterozygous, and homozygous knockouts were all well 
entrained to the 12:12 light:dark (L:D) cycle and maintained a 24-h 
period. Under free-running conditions, homozygous Chrono knockouts 
exhibited a statistically significant (p<0.05) ~25-min increase 
in circadian period as compared to wild-type controls (Figure 4I). 
Heterozygous knockouts display an intermediate period. The magnitude 
of this period change is similar to that observed in Clock 
(-20 min) [52], Per1 (-40 min) [53], Per3 (-30 min) [54], Nr1d1 
(-20 min) [6], Rorb (-25 min) [55], and Npas2 (-12 min) [56] 
knockout animals. These data strongly suggest that endogenous 
Chrono expression plays an important regulatory role in the 
mammalian circadian clock. 

Light, however, does not appear to directly influence CHRONO 
expression in the SCN. The SCN microarray data of Jagannath et 
al. do not reveal a significant change in Chrono expression following 
a nocturnal light pulse [57]. Moreover, in our own experiments, 
the phase shifting response of Chrono knockout mice to light pulses at 
ZT16 or ZT22 is not significantly different from control. Thus the 
primary role of CHRONO in the circadian clock appears to be in 
modulating core oscillator function and output timing rather than 
oscillator entrainment. 

CHRONO Binds the C-Terminal Region of BMAL1 

In a recent report, BMAL2 was shown to function as a tissue-specific 
paralogue of BMAL1 [58]. However, CHRONO specifically 
binds BMAL1 and not BMAL2 (Figure 3A). Moreover, 
CHRONO functionally represses the transcriptional activity of the 
BMAL1/CLOCK complex but not the activity of the BMAL2/CLOCK 
complex (Figure 5A). 

In order to identify the region of BMAL1 required for CHRONO 
binding, we generated mutant BMAL1 proteins with truncated N- 
or C-terminal regions (Figure 5B) 
and tested their interaction with CHRONO using the mammalian 
two-hybrid assay. Deletion of the C-terminal domain of BMAL1 
completely abolished CHRONO binding, whereas 
deletion of the N-terminal domain had no effect 
(Figure 5C). We next exploited the strong sequence homology 
between BMAL1 and BMAL2 to localize the CHRONO binding 
Figure 3. Physical and functional interactions of CHRONO. (A) Results from a matrix of mammalian two-hybrid assays between known 
circadian clock components fused to Gal4 DNA binding domain (Gal4 DBD) or VP16 activation domain (VP16 AD). Black and gold indicate bait-prey 
interactions that resulted in less or greater than 5-fold activation of the 4XUAS reporter, respectively. Co-IP with tagged CHRONO confirms complex 
formation with (B) BMAL1 and (C) PER2. (D) C- and N-terminal regions of Venus, an enhanced fluorescent protein, were fused with identified constructs. 
A yellow bi-molecular fluorescence signal identifies interactions. (E) HEK 293T cells were transiently transfected with a Per1:luc reporter, wild-type or 
mutant Bmal1/Clock, and increasing amounts of Cry1 or Chrono. BMAL1/CLOCK point mutants are resistant to CRY1-mediated repression but sensitive 
to CHRONO. (F) The ability of native CHRONO to repress BMAL1/CLOCK activity was determined by transient transfection with two distinct shRNA 
constructs directed against Chrono. The indicated plasmids were co-transfected with the Per1-luc reporter into HEK 293T cells. Average activities and 
standard deviations from reporter assays were determined from independent biological triplicates. 
doi:10.1371/journal.pbio.1001840.g003 



site within the BMAL1 C-terminal region. We swapped corresponding 
sections of the BMAL1 and BMAL2 C-terminal domains. 
As expected, the construct containing the N-terminal of BMAL1 
and the full C-terminal of BMAL2 (BMAL1-BMAL2) did not 
interact with CHRONO in the two-hybrid assay and was relatively 
immune to CHRONO-mediated repression (Figures 5C-E). A 
chimeric protein including the N-terminal region of BMAL2 with 
the longer BMAL1 C-terminus (BMAL2-BMAL1#1) interacted 
with CHRONO and phenocopied wild-type BMAL1 with regard 
to CHRONO-mediated repression (Figure 5D-F). Sequence 
alignment between C-terminal domains of BMAL1 and BMAL2 
reveals a region of poor alignment (514-594). Insertion of this 
unique region of the BMAL1 protein (514-594) into the BMAL2 C-terminus 
rendered the chimeric protein (Bmal2-Bmal1#2) responsive 
to CHRONO-induced repression (Figure 5E-F). This 
CHRONO binding region is adjacent to, but distinct from, the 
CRY1 interacting terminus [59]. Thus, CHRONO functions as a 
specific transcriptional co-repressor of BMAL1 through interaction 
with a unique C-terminal domain adjacent to the CRY1 binding 
region. This domain is both necessary and sufficient for physical 
and functional interactions with CHRONO. 

CHRONO Abrogates CBP/BMAL1 Binding 

Previous studies suggested that CBP also binds to the BMAL1 
C-terminus [59,60]. Thus, we hypothesized that CHRONO might 
interfere with BMAL1-CBP binding. We generated plasmids 
encoding BMAL1 and CBP fused to the C- and N-terminal 
regions of the Venus YFP. We then utilized BiFC to visualize 
BMAL1-CBP interactions in HEK 293T cell nuclei. BMAL1-CBP 
complex formation induced a yellow BiFC signal (Figure 6A). 
Co-expression of native or S-tagged CHRONO severely dampened 
BMAL1-CBP complementation. Western blotting (Figure 
S7A) confirmed stable abundance of BMAL1 and CBP proteins, 
implicating altered binding as the source of the reduced BiFC 
signal. Lastly, the ability of CHRONO to interfere with BMAL1-CBP 
binding was verified by co-IP analysis showing that overexpression 
of intact CHRONO reduced BMAL1-CBP complex 
formation (Figure 6B). 

A functional impairment in the ability of BMAL1 to recruit 
CBP is expected to reduce histone acetylation of CLOCK/BMAL1 
target regions. To assess the influence of CHRONO on the histone 
acetyl-transferase activity of the BMAL1/CLOCK complex, we 
performed a ChIP study using an antibody targeting acetylated 
histone H3 lysine 9 (H3-K9) (Figure 6C). PCR was used to 
specifically evaluate H3-K9 acetylation near the Per1 promoter 
E-box. Control samples obtained from immortalized human osteosarcoma 
(U2OS) cells 24 and 36 h after dexamethasone synchronization 
demonstrated a temporal variation in target acetylation. 
U2OS cells overexpressing CHRONO demonstrated a blunted 
temporal profile in Per1 promoter H3-K9 acetylation, with loss of 
the increased acetylation normally observed 24 h after synchronization 
[61,62]. 

In order to confirm that abrogated CBP/BMAL1 binding contributes 
to the CHRONO-mediated modulation of circadian 
dynamics, we constructed several CHRONO truncation mutants. 
All constructs that retained the 108-212 region reduced BMAL1-CBP 
binding as assessed by BiFC (Figure 6D and E). As has been 
previously demonstrated, overexpression of CBP, along with BMAL1 
and CLOCK, enhances Per1:luc expression in unsynchronized cells 
(Figure 6F). Those same CHRONO constructs that abrogated 
BMAL1-CBP complex formation also repressed CBP-enhanced 
Per1:luc reporter activity (Figures 6E and F and S7B). This pattern 
of activity among CHRONO truncation mutants was further 
mirrored in their ability to colocalize with BMAL1 (Figure S7C). 
Stable expression of the constructs in synchronized cells reveals the 
same pattern in their ability to modulate circadian reporter 
expression (Figure S7D and E). Thus, the abrogation of the 
BMAL1-CBP binding provides a plausible mechanism whereby 
CHRONO might influence circadian dynamics. 

Conclusions 

In summary, our data demonstrate that Chrono (i) oscillates with 
a circadian frequency in multiple tissues, (ii) physically interacts 
with BMAL1 and PER2, (iii) specifically reduces BMAL1/CLOCK-mediated 
transcription independently of CRY1, (iv) 
affects the free-running circadian period of mice, and (v) interferes 
with BMAL1-CBP binding, functionally repressing the CLOCK/BMAL1 
complex and modulating the circadian acetylation of 
target genes. Most importantly, CHRONO knockout mice display 
a long free-running circadian period, a change similar to or more drastic 
than that seen in knockouts of six other clock components. These data establish a role for 
Chrono in the mammalian circadian oscillator. Like CIPC [18], 
CHRONO appears unique to the vertebrate genome. Given the 
repressive function of CRY proteins, the evolutionary development 
of additional CLOCK/BMAL1 repressors in vertebrates 
highlights the importance of fine control of circadian rhythms. 
Transcriptional oscillations can differ in their amplitude, frequency, 
phase, basal expression, and waveform shape. The ability to 
independently control these characteristics likely requires multiple, 
tunable genetic parameters. The specificity of CHRONO-mediated 
repression for BMAL1 over BMAL2, along with tissue-specific 
variation in the expression of BMAL1 and BMAL2, may thus 
facilitate local tuning of circadian oscillations. 

Of course, there remain important, unanswered questions with
regard to the function of CHRONO in modulating circadian
dynamics. Although the abrogation of BMAL1-CBP binding is a plausible
mechanism for CHRONO-mediated repression, it may reflect
only part of its circadian function. Moreover, a nuanced understanding
of how this repression leads to a period-lengthening
phenotype in the knockout animal will likely require a greater
understanding of kinetics and network compensation. Our BiFC
data demonstrate that overexpressed CHRONO co-localizes with
the CLOCK/BMAL1 complex in nuclear bodies. It was previously
shown that BMAL1 recruits CBP primarily when localized
to promyelocytic leukemia (PML) nuclear bodies [63]. Thus the
interruption of CBP-BMAL1 binding within these nuclear
structures is consistent with the potent repression induced by
CHRONO overexpression (Figure 3E). Indeed, while this work
was in revision, Annayev et al. [64] also reported that CLOCK/BMAL1
transcription is efficiently repressed by CHRONO (GM129).
Our experimental work adds both a description of the circadian
locomotor phenotype of the Chrono knockout mouse and an
understanding of the mechanism by which this repression is mediated.

Figure 4. Influence of CHRONO on in vitro and in vivo rhythms. (A-D) Raw bioluminescence data from NIH 3T3 fibroblasts expressing the Bmal1-dLuc
reporter are plotted after transfection with four shRNA constructs targeted against Chrono. Control and Chrono knockdown tracings are depicted in blue
and red, respectively. Two replicates are shown. The period (E) and amplitude (F) of the observed rhythms are plotted. Representative wheel-running
activity records for (G) wild-type control and (H) Chrono^Flx/Flx knockout mice. Blue shading indicates light exposure during the initial 12:12 h L:D cycle.
Arrows indicate transition to constant darkness. Regression lines fit to activity onset and computed period are shown. (I) Periodogram estimates of
observed periods from wild-type (n = 5), Chrono^Flx/+ (n = 8), and Chrono^Flx/Flx mice (n = 6). Error bars indicate standard error of the mean.
doi:10.1371/journal.pbio.1001840.g004

Figure 5. CHRONO interacts with the C-terminus of BMAL1 but not BMAL2. (A) Overexpression of either BMAL1 or BMAL2, along with
CLOCK, activates Per1:Luciferase reporter activity. Both are repressed by overexpression of CRY1. CHRONO specifically represses BMAL1-induced
reporter activity. (B) BMAL1 and BMAL2 have similar structures with conserved bHLH DNA binding domains and PAS A and B interaction domains.
BMAL1 contains a unique C-terminal region. Chimeric proteins were constructed by swapping corresponding domains from each protein as shown.
Two-hybrid screening in HEK 293T cells demonstrates that BMAL1 truncation mutants (C) and chimeric proteins (D) that contain the 487-586 region
of BMAL1 bind CHRONO and induce UAS:Luc reporter expression. This region is adjacent to but distinct from the annotated CRY1 binding site. (E) All
BMAL1-BMAL2 constructs induce Per1:luc reporter activity in HEK 293T cells. In all constructs, reporter signal is repressed by the addition of CRY1.
Functional repression by CHRONO is limited to BMAL constructs containing the implicated binding domain. (F) In cells overexpressing MYC-CHRONO
along with BMAL1, BMAL2, or a chimeric BMAL2-BMAL1 construct, co-IP confirms complex formation between CHRONO and proteins containing the
implicated BMAL1 C-terminal region.
doi:10.1371/journal.pbio.1001840.g005

Although our work focused on the interaction between CHRONO
and BMAL1, CHRONO might also influence circadian physiology
through its interaction with PER2. It was recently reported that
PER2 also localizes to PML nuclear bodies [65]. The importance of
CHRONO/PER2 binding (Figure 3A,C), both within this complex
and more generally, remains unexplored. PER2 not only binds
cryptochromes but also interacts with the nuclear receptors NR1D1
and RORA [66]. Our preliminary tests (Figure S8) show that
overexpression of CHRONO enhances PER2/NR1D1 complex
formation. The recruitment of this established circadian repressor
provides another mechanism for CHRONO-enhanced repression
of the circadian network. The importance of CHRONO/PER2
binding and a broader analysis of the role of CHRONO in the
circadian network will require further study.

The extent to which CLOCK can recruit CBP/P300 independently
of BMAL1 also remains unclear [67]. Given the highly
redundant structure of the circadian oscillator [68], the ability of
CLOCK to recruit a co-activator hints that there may be a
functional paralogue of CHRONO acting on the other half of the
BMAL1/CLOCK complex. Perhaps most importantly, the
knockout and targeted disruption of several other clock factors
have been shown to influence not only circadian period but also
downstream physiology, including metabolism [69] and sleep
homeostasis [70,71]. More detailed phenotyping of CHRONO
knockout mice will be required to identify any such deficits.

Machine learning has recently been applied to complex biological
problems including drug discovery [72], protein translation [73],
and gene interaction networks in yeast [74]. We used a simple form
of probabilistic machine learning to integrate sparse existing data
whose joint distribution is hypothesized to yield a more specific
ranked list of candidate genes. Although follow-up experimentation
is an important part of this process, the identification of Chrono
reflects the ability of this approach to find genes regulating circadian
behavior. To our knowledge, this is the first application of these
methods to identify genes responsible for complex neurological
behaviors. We anticipate that the investigation of other candidates
will advance the understanding of circadian rhythms. Indeed, in
addition to CHRONO, our initial screening of the top 25 novel
candidates identified two other proteins that both bind clock
components and modulate in vitro circadian oscillations. To facilitate
the experimental characterization of these and other candidates, a
more exhaustive candidate ranking is provided in Table S2. As bona
fide clock components are discovered and high-quality datasets
become available, exemplar distributions can be re-evaluated and
feature metrics can be improved. Thus, this integrated computational
and experimental approach presents a path for leveraging
genome-scale data to develop insight into circadian biology.

Materials and Methods 

Ethics Statement 

All animal experiments were performed with the approval of the
Institutional Animal Care and Use Committee (IACUC Protocol
Numbers 801906 and 803945).



Informatics 

Unless otherwise specified, all computations were done in the R 
programming environment [75]. 

Metric Function Construction 

Cycling. Time-course datasets spanning 48 h with a 2-h
sampling frequency obtained from pituitary, liver, and NIH 3T3
cells [25] were separately normalized using the GCRMA function
(Bioconductor package) [76]. The R implementation of JTK_cycle
[77] was applied to each tissue-specific dataset, and the p value
describing the probability of observing the given data under the
null hypothesis of nonperiodic behavior was obtained. The cycling
metric was computed from the product of the three p values:
-log(p_Liver × p_Pituitary × p_NIH3T3). Thus, the cycling metric does not
simply assign a gene as "cycling" or "noncycling" but provides a
continuous measure reflecting the robustness of cycling in several
tissues.
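
The cycling metric described above can be sketched in a few lines. This is an illustrative Python sketch rather than the R code used in the study; the function name `cycling_metric`, the use of the natural logarithm, and the small `floor` that guards against p values of exactly zero are all assumptions.

```python
import math

def cycling_metric(p_liver, p_pituitary, p_nih3t3, floor=1e-300):
    """Return -log(p_liver * p_pituitary * p_nih3t3) for one gene.

    Assumptions (not stated in the text): natural log, and a tiny
    floor so that a p value reported as 0 does not produce log(0).
    """
    p_values = (p_liver, p_pituitary, p_nih3t3)
    # Summing logs is equivalent to taking the log of the product,
    # but avoids floating-point underflow for very small p values.
    return -sum(math.log(max(p, floor)) for p in p_values)

# A gene that cycles robustly in all three tissues scores higher
# than one that cycles in only a single tissue.
robust = cycling_metric(1e-6, 1e-5, 1e-4)
single = cycling_metric(1e-6, 0.8, 0.9)
```

Because the three p values enter as a product, strong cycling in one tissue can be partly offset by weak cycling in another, which is what makes the measure continuous rather than a yes/no call.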

Circadian influence. Screen methods and initial processing
were presented previously [28]. In brief, each gene was targeted by
two distinct pools of siRNA constructs. Two replicate wells were
utilized for each siRNA pool. Kinetic luminescence readings were
fit to sinusoidal waves to obtain an amplitude and period for each
well. The log ratio between target and control circadian
parameters was provided by the study authors [28]. Separate log
ratios were computed for period and amplitude parameters. For
each gene, the siRNA pool that induced the greatest magnitude in
log change was used for further analysis. The z scores for the
induced amplitude (Z_amp) and period (Z_period) changes, in comparison
to all other targeted genes, were computed. The circadian
influence metric was computed as Abs(Z_amp) + Abs(Z_period).
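
As a rough illustration of the circadian influence metric, the Python sketch below takes per-pool log ratios for each gene, keeps the pool with the largest absolute change, converts those values to z scores across all genes, and sums the absolute z scores. The function names and input layout are hypothetical, and two details are assumptions: the strongest pool is chosen separately for period and for amplitude, and the sample standard deviation is used for the z scores.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize a list of values against its own mean and sample SD."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]

def circadian_influence(period_log_ratios, amplitude_log_ratios):
    """Map each gene to Abs(Z_amp) + Abs(Z_period).

    Both arguments are dicts: gene -> list of log ratios, one per siRNA pool.
    """
    genes = sorted(period_log_ratios)
    # Assumption: the pool with the greatest magnitude of change is
    # picked independently for the period and amplitude parameters.
    strongest = lambda pools: max(pools, key=abs)
    period = [strongest(period_log_ratios[g]) for g in genes]
    amplitude = [strongest(amplitude_log_ratios[g]) for g in genes]
    z_period, z_amp = z_scores(period), z_scores(amplitude)
    return {g: abs(a) + abs(p) for g, a, p in zip(genes, z_amp, z_period)}

scores = circadian_influence(
    {"a": [0.0, 0.0], "b": [0.0, 0.0], "c": [-0.4, 0.1]},
    {"a": [0.0, 0.0], "b": [0.0, 0.0], "c": [0.2, 0.6]},
)
```

Taking absolute values means that period lengthening and shortening, or amplitude gain and loss, count equally as evidence of circadian influence.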

Interaction. The supplementary table providing the fully
connected genetic interaction network was obtained from [29].
For each gene, the number of interactions with the exemplar clock
list was tabulated. Only nonself interactions were included.

Ubiquity. Tissue ubiquity scores, which equal the total
number of distinct murine tissues in which ESTs for a gene had
been identified, were obtained from the authors of [30] and used
as the ubiquity metric. Unlike the other features, a single cutoff
value was used to discriminate the likelihood that a given gene
might be a core circadian component. The cutoff was determined
by receiver operating characteristic (ROC) curve analysis, selecting
the point on the curve with maximal distance from the line of identity [78].
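
The cutoff selection can be illustrated as follows. For a point on the ROC curve, the perpendicular distance from the line of identity is proportional to the true-positive rate minus the false-positive rate, so maximizing that difference picks the same point. The Python sketch below is not the authors' R code: the function name `roc_cutoff` is hypothetical, and it assumes that a higher ubiquity score makes clock membership more likely (the rule "score >= cutoff").

```python
def roc_cutoff(scores, is_clock):
    """Return (cutoff, tpr_minus_fpr) maximizing distance from the identity line.

    scores   : ubiquity score for each gene
    is_clock : True for exemplar clock genes, False otherwise
    """
    n_pos = sum(is_clock)
    n_neg = len(is_clock) - n_pos
    best_cutoff, best_gap = None, float("-inf")
    for t in sorted(set(scores)):
        # Classify a gene as "clock-like" when its score is at least t.
        tp = sum(1 for s, c in zip(scores, is_clock) if c and s >= t)
        fp = sum(1 for s, c in zip(scores, is_clock) if not c and s >= t)
        gap = tp / n_pos - fp / n_neg
        if gap > best_gap:
            best_cutoff, best_gap = t, gap
    return best_cutoff, best_gap

# Toy data: clock genes are expressed in many more tissues.
cutoff, gap = roc_cutoff([1, 2, 3, 10, 12, 15],
                         [False, False, False, True, True, True])
```

On ties, this sketch keeps the lowest qualifying cutoff; a real analysis would need to decide how ties should be broken.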

Homologene. The set of all gene groups in the Homologene
database (Build 66) that have mouse and human homologues was
used to represent mammalian genes [31]. For each of these
Homologene groups, we checked whether a Drosophila melanogaster
homologue was identified.

Identifier mapping. Gene identifiers used from the various 
component datasets were all mapped onto Homologene identifiers 
using the flat file from the Homologene database (Build 66) [31]. 
Identifiers that were not listed in the Homologene database were 
submitted to the NCBI biological database network for mapping 
to the appropriate Homologene identifier [79]. Data associated 
with gene identifiers that remained unmapped after both attempts 
were ignored for further analysis. 

Figure 6. CHRONO interferes with BMAL1-CBP binding. (A) BiFC was used to observe BMAL1-CBP interactions in the nuclei of HEK 293T cells.
Co-expression of intact or S-tagged CHRONO reduced the complementation signal. Expression of the 212-385 CHRONO truncation mutant had no
discernable effect. (B) IP confirms CHRONO-mediated interference in BMAL1/CBP complex formation. Endogenous protein was immunoprecipitated
with anti-CBP antibody followed by immunoblotting as indicated. (C) ChIP qPCR analysis was used to evaluate the effect of CHRONO on the
acetylation of histone H3-K9 near the Per1 promoter E-box region. Schematic diagram of the human Per1 promoter and primers used for ChIP assay
are shown. Lysates obtained from control U2OS cells and those stably expressing CHRONO were collected 24 and 36 h after dexamethasone
synchronization. ChIP DNA samples were quantified by quantitative real-time RT-PCR. Data are mean ± standard error of biological triplicates. (D)
Various S-tagged, N-, and C-terminal CHRONO truncation mutants were generated. (E) Percent of cell nuclei demonstrating complementation after
overexpression of various CHRONO constructs. (F) Per1:luciferase reporter signal in unsynchronized cells overexpressing BMAL1/CLOCK is enhanced
by the transient overexpression of CBP. The effect of the overexpression of CHRONO constructs on reporter activity is shown.
doi:10.1371/journal.pbio.1001840.g006

Evidence Factor Derivation

The derivation of circadian evidence factors closely follows that
for Bayes factors [32], and our strategy follows the Naive Bayes
Classifier approach of "learning" the feature distributions from the
training data. We considered an individual feature described by
metric x, and a single arbitrary gene with observed metric value D.
The event space is divided into two disjoint events: x ≥ D and x < D.
These events correspond to a randomly selected gene having a
metric value at least as extreme as D or the randomly selected gene
having a metric value less than D. The events are labeled a and b,
respectively. The use of an interval rather than a point allows us to
regularize the sparse empirical data for the estimation. Each gene
is assumed to belong to either the set of clock genes (Cgene) or the
set of nonclock genes (NCgene).
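By Bayes' Theorem:

$$P(\mathrm{gene} \in \mathit{Cgene} \mid a) = \frac{P(a \mid \mathrm{gene} \in \mathit{Cgene})\, P(\mathrm{gene} \in \mathit{Cgene})}{P(a)} \tag{E1}$$

and

$$P(\mathrm{gene} \in \mathit{NCgene} \mid a) = \frac{P(a \mid \mathrm{gene} \in \mathit{NCgene})\, P(\mathrm{gene} \in \mathit{NCgene})}{P(a)} \tag{E2}$$

Dividing (E1) by (E2) yields:

$$\frac{P(\mathrm{gene} \in \mathit{Cgene} \mid a)}{P(\mathrm{gene} \in \mathit{NCgene} \mid a)} = \frac{P(a \mid \mathrm{gene} \in \mathit{Cgene})}{P(a \mid \mathrm{gene} \in \mathit{NCgene})} \cdot \frac{P(\mathrm{gene} \in \mathit{Cgene})}{P(\mathrm{gene} \in \mathit{NCgene})} \tag{E3}$$

Substituting the definition of a, the middle term of (E3) becomes:

$$K = \frac{P(x \geq D \mid \mathrm{gene} \in \mathit{Cgene})}{P(x \geq D \mid \mathrm{gene} \in \mathit{NCgene})} \tag{E4}$$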

The left-hand side of (E3) is the posterior odds of a gene being a 
core clock component conditional on observing a metric value 
greater than or equal to D. The last term represents the general 
odds of clock gene membership without additional feature 
information. Thus, the posterior odds of a gene belonging to the 
set of clock genes (given a metric value greater than or equal to D) 
is equal to the product of K and the a priori odds. 

Combined Evidence 

Our analysis included n = 5 clock gene features. For each metric
x_i, the event space is divided into two disjoint events, x_i ≥ D_i and
x_i < D_i for some D_i, and these events are labeled a_i and b_i,
respectively. Following the steps above:



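$$\frac{P(\mathrm{gene} \in \mathit{Cgene} \mid a_1, a_2, \ldots, a_n)}{P(\mathrm{gene} \in \mathit{NCgene} \mid a_1, a_2, \ldots, a_n)} = \frac{P(a_1, a_2, \ldots, a_n \mid \mathrm{gene} \in \mathit{Cgene})}{P(a_1, a_2, \ldots, a_n \mid \mathrm{gene} \in \mathit{NCgene})} \cdot \frac{P(\mathrm{gene} \in \mathit{Cgene})}{P(\mathrm{gene} \in \mathit{NCgene})} \tag{E5}$$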



The middle term in equation (E5) is the factor by which the a
priori odds of clock gene membership must be adjusted to recover
the posterior odds after all of the observed data. It represents the
combined evidence factor (K_com) given all five features. Given the
number of features, the training set of circadian clock components
is too sparse to approximate the required joint distribution without
some regularizing assumption. We follow the typical Naive Bayes
approach and show that, given conditional independence of the
included features, K_com is simply the product of the individual
evidence factors.

By definition, random variables x_1, x_2, ..., x_n with probability
density functions p_i(x_i) and joint probability density function
p(x_1, x_2, ..., x_n) are conditionally independent given a random
variable Z if and only if:

$$p(x_1, x_2, \ldots, x_n \mid Z = z) = \prod_{i=1}^{n} p_i(x_i \mid Z = z) \quad \text{for all } z.$$



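Using this definition and the definition of the events a_i, the
denominator of K_com can be simplified:

$$\begin{aligned}
P(a_1, a_2, \ldots, a_n \mid \mathrm{gene} \in \mathit{NCgene})
&= P(x_1 \geq D_1, x_2 \geq D_2, \ldots, x_n \geq D_n \mid \mathrm{gene} \in \mathit{NCgene}) \\
&= \int_{D_1}^{\infty} \int_{D_2}^{\infty} \cdots \int_{D_n}^{\infty} p(x_1, x_2, \ldots, x_n \mid \mathrm{gene} \in \mathit{NCgene})\, dx_1\, dx_2 \cdots dx_n \\
&= \int_{D_1}^{\infty} \int_{D_2}^{\infty} \cdots \int_{D_n}^{\infty} \prod_{i=1}^{n} p_i(x_i \mid \mathrm{gene} \in \mathit{NCgene})\, dx_i \\
&= \prod_{i=1}^{n} \int_{D_i}^{\infty} p_i(x_i \mid \mathrm{gene} \in \mathit{NCgene})\, dx_i \\
&= \prod_{i=1}^{n} P(x_i \geq D_i \mid \mathrm{gene} \in \mathit{NCgene}).
\end{aligned}$$

Similarly, the numerator of K_com becomes
$\prod_{i=1}^{n} P(x_i \geq D_i \mid \mathrm{gene} \in \mathit{Cgene})$,
and the cumulative evidence is equal to

$$K_{com} = \prod_{i=1}^{n} \frac{P(x_i \geq D_i \mid \mathrm{gene} \in \mathit{Cgene})}{P(x_i \geq D_i \mid \mathrm{gene} \in \mathit{NCgene})}.$$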



Computation of Evidence Factors 

Given either the distribution of metric values among exemplar
clock genes or the distribution among the genome at large, the
probability of obtaining a metric greater than, or equal to, that
observed was approximated with the ecdf() function in R. For any
given gene and feature, the ratio of these probabilities was computed
to obtain the value of K. Metric values greater than the
maximum value observed among exemplar clock components
were assigned the same evidence factor as that maximum value.
Combined evidence factors are the product of the feature-specific
factors. If no data were available for a given gene and feature, this
feature was ignored by setting the corresponding evidence factor to
1. The ubiquity and homology metrics were both Boolean variables,
and the standard Bayes factor formula was used for these features.
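
The computation described above can be sketched as follows. This is a minimal Python illustration, not the authors' R implementation: the function names are hypothetical, tail probabilities are counted directly instead of being derived from `ecdf()`, and the sketch assumes that the genome-wide values include the exemplar clock genes, so the denominator is never zero once the observed value is capped at the exemplar maximum. The Boolean ubiquity and homology features are not handled here.

```python
def tail_prob(sample, d):
    """Empirical P(x >= d) over a list of observed metric values."""
    return sum(1 for x in sample if x >= d) / len(sample)

def evidence_factor(d, clock_values, genome_values):
    """K = P(x >= d | clock genes) / P(x >= d | genome at large)."""
    if d is None:
        return 1.0  # missing data: the feature is ignored
    # Values above the exemplar maximum get that maximum's factor.
    d = min(d, max(clock_values))
    return tail_prob(clock_values, d) / tail_prob(genome_values, d)

def combined_evidence(observed, clock, genome):
    """Product of feature-specific factors (conditional independence)."""
    k = 1.0
    for feature, d in observed.items():
        k *= evidence_factor(d, clock[feature], genome[feature])
    return k

# Toy distributions for two continuous features.
clock = {"cycling": [2, 4, 6, 8], "interaction": [1, 3]}
genome = {"cycling": list(range(10)), "interaction": [0, 0, 1, 3]}
k_com = combined_evidence({"cycling": 6, "interaction": 3}, clock, genome)
```

Capping at the exemplar maximum avoids a numerator of zero for genes that are more extreme than every known clock gene, which would otherwise give them the worst possible score.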

Cross-Validation and Method Comparison 

The use of combined evidence factors was compared with two 
prepackaged, supervised machine learning algorithms in the R 
programming environment: a Gaussian/Normal Naive Bayes classifier
within the "e1071" package [80] and a Flexible Naive Bayes
classifier [39] within the "klaR" package [81]. Probabilistic 
learning algorithms were preferred as they do not require a prior 
weighting of the importance of the various features [24]. For 
training, genes not in the exemplar clock group were labeled as 
nonclock genes, and the classifier was trained on the entire dataset. 
Genes were rank ordered on the posterior probability of clock gene 
membership after the model was applied to the data. For the 
Flexible Naive Bayes implementation, kernel density estimation 
was performed with the default value for the "window parameter." 
This default uses a heuristic formula to adjust the window of kernel 
density estimate based on the number of data points. 

We sequentially removed all possible pairs of clock components
from the exemplar distribution and retrained the various learning
algorithms on the reduced exemplar sets, testing our theoretical
ability to recover these known clock genes using different ranking
cutoffs (Figure S2A). The three methods all had comparable
performance using cutoffs less than ~1,000, but the evidence factor
method outperformed the other two beyond this point. The top
candidates from all three methods show a very high degree of
overlap (Figure S2B). We estimated the false discovery rate (FDR) of
the Evidence Factor approach by combining the sensitivity analysis
with an assumed total number of clock components to generate an
expected number of true and false positives at different ranking
thresholds (Figure S2C).
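
The leave-pair-out procedure can be sketched generically. In the Python sketch below, `rank_genes` is a hypothetical callable standing in for any of the three methods: it receives the reduced exemplar set and returns all genes ordered from most to least clock-like. The function name and the way recovery is summarized (fraction of held-out genes inside the cutoff) are assumptions made for illustration.

```python
from itertools import combinations

def leave_pair_out_recovery(exemplars, rank_genes, cutoff):
    """Fraction of held-out exemplars ranked within the top `cutoff` genes."""
    recovered = total = 0
    for held_out in combinations(sorted(exemplars), 2):
        # Retrain on the exemplar set with this pair removed.
        training = set(exemplars) - set(held_out)
        ranking = rank_genes(training)
        top = set(ranking[:cutoff])
        recovered += sum(1 for gene in held_out if gene in top)
        total += len(held_out)
    return recovered / total

# Toy ranking that ignores the training set, for demonstration only.
toy_ranking = lambda training: ["a", "x", "b", "y", "c"]
sensitivity_top3 = leave_pair_out_recovery({"a", "b", "c"}, toy_ranking, 3)
```

Repeating the call over a range of cutoffs gives a sensitivity curve of the kind summarized in Figure S2A.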

Supporting Microarray Results in Mutant Animals and in 
Response to Light 

Preprocessed microarray data obtained from WT and Clock
mutant animals as reported by Miller et al. were downloaded from
the Circa database [47] and replotted. A single apparent outlier
from the SCN data (Mutant, original time point 46) is excluded
from the plot as this value was greater than any other SCN
expression value from WT or mutant animals, and ~3x the
replicate measure. CEL files from the Cry1/Cry2 double mutant
were obtained from NIH GEO and normalized via GCRMA [76].

Exon-array CEL files describing the transcriptional response of
WT and melanopsin knockout animals to sham control and
following a light pulse [57] were downloaded from NIH GEO.
Data were extracted, annotated, quantile normalized, and log
transformed at the gene level using the Affymetrix Expression
Console package (v1.1). The probeset corresponding to Gm129
was then separately analyzed. In both WT and knockout animals,
when compared to sham control, Chrono expression did not
significantly change 30, 60, or 120 min after light pulse.

Experimental Confirmation of Gm129 Cycling in Tissue
Samples

Tissue collection. Six-week-old male C57BL/6J mice (Jackson)
were housed in light-tight boxes and entrained to a 12-h light,
12-h dark schedule for 1 wk before being switched to constant 
darkness. Starting at circadian time (CT) 18, 2-3 mice were 
sacrificed per time point. Liver, white fat, and skeletal muscle 
samples were excised and snap-frozen in liquid nitrogen. 

qPCR. We homogenized 2 mm³ tissue samples in 500 µl
Trizol (Invitrogen) using a TissueLyzer (Qiagen), and total RNA
was purified using RNeasy columns according to the manufacturer's
protocol (Qiagen). Reverse transcription and qPCR were
carried out as per Baggs et al. [68].

In Vitro Function and Binding Experiments 

cDNA and shRNA expression plasmids. Construction of
plasmids expressing wild-type and CRY-insensitive mutant Bmal1
and CLOCK, and wild-type Cry1, Cry2, and NPAS2 cDNAs was
published previously [49]. CHRONO/C1orf51, BMAL2, and MGC
library cDNAs in Sport6 vector (Invitrogen) were obtained from
Open Biosystems (Huntsville, AL). Per1, Per2, and Per3 cDNAs were
published elsewhere [5].

Mammalian two-hybrid constructs, the two-hybrid reporter plasmid,
epitope-tagged cDNAs, and S-tagged CHRONO constructs were
cloned using standard recombinant genetic techniques.

Hybrid Bmal1 and Bmal2 genes were generated by gene splicing
by overlap extension (SOE) [82] with 50-60 bp primers that
overlap the junctions of the Bmal1/Bmal2 fusions. All Bmal1/Bmal2
hybrid fusions were sequenced to verify that the full-length,
in-frame fusion was generated. The pGL3P-Per1 [17] and
pGL3Basic-Bmal1 [7] reporters are described elsewhere. The
pGIPZ nonsilencing shRNAmir control and hChrono-directed
shRNAs #1 (Oligo ID V2LHS_17058) and #2 (Oligo ID V2LHS_17062)
constructs were purchased from Open Biosystems.

Transient transfections and cell-based reporter
assays. Ninety-six-well Per1 promoter-luciferase reporter assays
in HEK 293T cells were performed as reported elsewhere
[49] with modification. We cotransfected 5 ng of a Renilla
luciferase (Rluc) expression plasmid to normalize reporter activity
for transfection efficiency. We used 50 ng of pGIPZ vector in
shRNA cotransfections. For mammalian two-hybrid assays, 25 ng
of pGL4P-4XUAS, 5 ng Rluc, 50 ng pACT, and 50 ng pBIND
plasmids were transfected into HEK 293T cells in 96-well plates as
previously described [49]. Transfected cells were analyzed after
24 h incubation for luciferase reporter activity with DualGlo
luciferin reagent (Promega).

Stable transgenic cell line creation. Per2-dLuc U2OS cells
were stably transfected with pcDNA3.1 vectors expressing S-tagged
CHRONO wild-type and truncation mutants. The cells
were grown under selection (G418; Invitrogen)
for 4 wk. After selection, co-IP and kinetic luminometry
were performed as described.

Native co-IPs and Western blotting. Native co-IPs and
Western blotting of epitope-tagged proteins expressed in HEK
293T cells were performed as previously described [49]. We
transfected 3 µg of total plasmid DNA per 10 cm Petri dish
with 1-1.5 µg of individual pCMV-Sport6 expression plasmids
transfected in each condition. For co-IP of Flag-tagged CLOCK
or BMAL1 with Myc-CHRONO, 1 µg of empty pCMV-Sport
vector was transfected to normalize transfections to 3 µg total
DNA.

Isolation and quantification of RNA levels by real-time
PCR. HEK 293T cells in 24-well plates at 80% confluence were
transfected with 100 ng pCMV-Bmal1 or Bmal2, 250 ng pCMV-CLOCK,
and 100 ng of empty, Cry1, or CHRONO expression
plasmids and FugeneHD (3 µl FugeneHD:1 µg plasmid DNA).
RNA was harvested 24 h after transfection. RNA levels were
measured by real-time PCR as already described.

IP analysis for BMAL1 binding. HEK293 cells were
transfected with plasmids encoding Flag-BMAL1, Flag-CLOCK,
and S-tagged wild-type and mutant CHRONO constructs as
indicated in the figures. At 48 h post-transfection, the cell lysates
were harvested in radioimmunoprecipitation assay (RIPA) buffer
supplemented with a protease inhibitor cocktail (Roche) and
centrifuged at maximum speed for 20 min at 4°C. Equal amounts of
total protein were incubated with 2 µg of anti-S-Tag (Novagen)
antibody overnight and then with a protein G-Sepharose bead slurry.
The final immune complexes were analyzed by immunoblotting.
Immunoblot analyses were performed on 6% or 8% sodium
dodecyl sulfate polyacrylamide gels and transferred to
polyvinylidene difluoride membranes (Immobilon P; Millipore). Target
proteins were detected with anti-S-Tag (Novagen) and anti-Flag
M2 (Sigma) antibodies. The immune complexes were visualized
with HRP-conjugated secondary antibodies and ECL detection
(Pierce).

IP analysis for PER2/NR1D1 binding. HEK 293T cells
were transfected with plasmids encoding PER2-Venus, NR1D1-Flag,
and CHRONO-S as indicated in the figures. At 48 h
post-transfection, the cells were harvested for IP procedures as
described above. Immune complexes were precipitated after the
overnight incubation of the cell lysates with 4 µg anti-GFP antibody
(Sigma, G1544). Complexes were immunoblotted using anti-GFP,
anti-S-Tag (Novagen), and anti-Flag M2 (Sigma) antibodies.

Chromatin IP. Both control U2OS cells and those stably
overexpressing CHRONO were used in ChIP analysis. Lysates
were obtained 24 and 36 h after dexamethasone synchronization.
Experimental procedures to prepare chromatin were performed as
described by Schmidt et al. [83]. The precleared chromatin was
immunoprecipitated overnight at 4°C by agitating with 5 µg of
anti-acetyl-histone H3 (Lys9) antibody (07-352, EMD Millipore).
Cell extracts not incubated with antibody were used as the
input control. Immune complexes were collected by incubation
with protein G-coated magnetic beads (10004D, Life Technologies),
and the final eluted DNA was extracted by phenol-chloroform-isoamyl
alcohol (25:24:1) and ethanol precipitation. The primer sets
used for ChIP qPCR analysis of the human Per1 promoter region
spanning the canonical E-box (CACGTG) were as follows: forward primer,
5'-TCTCCCTCTCTCCTCCCTTCC-3'; reverse primer,
5'-GCCTGATTGGCTAGTGGTCTT-3'.

BiFC and Immunofluorescence (IF) Assays 

C- and N-terminal regions of an enhanced variant of YFP called
Venus were fused with identified constructs. Expression vectors of
S-tagged full-length CHRONO (1-385) and its various deletion
mutants were cotransfected with GFP-BMAL1 expression vector
or BiFC fusion plasmids encoding VC-BMAL1, CLOCK-VN, or
CBP-VN. At 16 h post-transfection, the cells were fixed with 4%
paraformaldehyde in PBS and incubated with anti-S-tag (Bethyl
Laboratories, Inc.) and anti-hC1orf51 (Santa Cruz Biotechnology)
antibodies, followed by secondary antibodies conjugated to Alexa
Fluor 568 (Invitrogen). Cells were visualized using fluorescein
isothiocyanate and tetramethylrhodamine isothiocyanate filters in
fluorescence microscopy.

Genotype and Circadian Phenotype for Chrono Knockout Mice

Generation of mice containing Chrono^Flx alleles. The
Gm129 (Chrono) mouse strain (Gm129^tm1a(KOMP)Wtsi) was created
from embryonic stem cell clone EPD0378_5_B03 generated by
the Wellcome Trust Sanger Institute and made into mice by the
KOMP Repository and the Mouse Biology Program at the
University of California, Davis. Heterozygous mice (Chrono^Flx/+) on
a C57BL/6 background were bred to generate homozygous
(Chrono^Flx/Flx), WT (Chrono^+/+), and heterozygous (Chrono^Flx/+) mice.

Circadian behavioral analysis. Mice were housed in
individual cages within a temperature- and humidity-controlled,
light-tight enclosure. Each cage contained a running wheel. Food
and water were allowed ad libitum. Wild-type (n = 5), Chrono^Flx/+
(n = 8), and Chrono^Flx/Flx mice (n = 6) were entrained to a 12:12 h
L:D cycle for >2 wk before being released into constant darkness.
Locomotor activity monitoring, actogram creation, and period
calculations were performed using ClockLab Data Collection
(Actimetrics). Statistical analysis of period change was done through
application of both a t test and an alternative, nonparametric
Mann-Whitney test using the t.test() and wilcox.test() functions in R.
Both tests resulted in a significantly (p ≤ 0.05) greater period among
mutant as compared to wild-type mice.

Phase Response to Light Pulse 

A modified Aschoff type II procedure was used, facilitating the
exposure of animals to light pulses before their free-running
rhythms had drifted apart significantly [84,85]. Animals were
entrained to a 12:12 L:D cycle and then placed in constant
darkness (D:D) prior to a 30-min light pulse. The light pulses were
initiated at zeitgeber times (ZTs) 16 or 22 on the second day of
D:D. Animals remained in D:D for 7 d following the light pulse.

Daily activity onset times were determined using ClockLab Data 
Collection software (Actimetrics) and were exported for further 
analysis. The phase response was calculated as the difference
between activity onset predictions as determined by prepulse and
postpulse regression lines computed in R. The prepulse regression 
line was fit from activity onset data for 5 d prior to the light pulse. 
The postpulse regression line was determined from the first through 
seventh days in D:D following the pulse [85]. 
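
The phase-response calculation can be illustrated with two ordinary least-squares fits. The Python sketch below is not the authors' R code. The function names, the choice of the day on which the two predictions are compared (`reference_day`), and the sign convention (positive values mean onset occurs earlier than the prepulse line predicts, i.e. a phase advance) are assumptions made for illustration.

```python
def fit_line(days, onsets):
    """Least-squares slope and intercept of activity onset (h) versus day."""
    n = len(days)
    mean_day = sum(days) / n
    mean_onset = sum(onsets) / n
    sxx = sum((d - mean_day) ** 2 for d in days)
    sxy = sum((d - mean_day) * (o - mean_onset) for d, o in zip(days, onsets))
    slope = sxy / sxx
    return slope, mean_onset - slope * mean_day

def phase_shift(pre_days, pre_onsets, post_days, post_onsets, reference_day):
    """Difference between prepulse and postpulse onset predictions (hours)."""
    pre_slope, pre_intercept = fit_line(pre_days, pre_onsets)
    post_slope, post_intercept = fit_line(post_days, post_onsets)
    predicted_pre = pre_slope * reference_day + pre_intercept
    predicted_post = post_slope * reference_day + post_intercept
    # Assumed convention: positive = advance, negative = delay.
    return predicted_pre - predicted_post

# Toy data: onset drifts 0.1 h/day and jumps 1 h earlier after the pulse.
pre_days = [-5, -4, -3, -2, -1]
post_days = [1, 2, 3, 4, 5, 6, 7]
shift = phase_shift(pre_days, [12.0 + 0.1 * d for d in pre_days],
                    post_days, [11.0 + 0.1 * d for d in post_days],
                    reference_day=1)
```

Fitting each segment separately lets the free-running drift cancel, so the remaining offset between the two lines reflects the effect of the light pulse.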

ShRNA-Mediated Knockdown and Kinetic Luminescence 

Cell culture. NIH 3T3 mouse fibroblasts were cultured in 
DMEM supplemented with 10% fetal bovine serum and anti- 
biotics, and grown to confluence prior to bioluminescence record- 
ing or harvesting for mRNA time courses. 

Lentivirus. Lentiviral particles were produced by transient
transfection in HEK 293T cells using the calcium-phosphate
method as previously described [86]. Infectious lentiviruses were
harvested at 48 h post-transfection and used to infect NIH 3T3
cells. NIH 3T3 cells were first infected with pLV7-P(Bmal1)-dLuc
reporter followed by blasticidin selection to generate 3T3 reporter
cells [27].

shRNA. Seven shRNAs targeting different regions of the Chrono
gene were designed. A nonspecific (NS) shRNA construct was used
as a control. Synthetic oligonucleotides were annealed and cloned 
into pENTR/U6 (Invitrogen) and subsequently cloned into the 
pLL3.7GW vector as previously described [27]. The NIH 3T3 
reporter cells were then infected with shRNA viruses. 

Western. A fragment of the Chrono open reading frame (nts
352-1128) was first cloned into p3xFlag-CMV-14 vector and
cotransfected with pLL3.7GW-shRNA into NIH 3T3 cells. shRNA
knockdown efficiency was determined by Western blot analysis.

Primers used for cloning were as follows: forward primer,
GAATTCccaccatggaactccaagggttcatacggcccctca (EcoRI); reverse primer,
TCTAGAgggctgaggatccggagcaactgg (XbaI).

Cell harvest and qPCR. Total RNAs from NIH 3T3 cells 
were first prepared using Trizol reagents (Invitrogen) followed by 
further purification using RNeasy mini kit (Qiagen). Reverse 
transcription and qPCR were performed as previously described 
[27] except that probe and primers for Chrono were purchased 
from ABI. Transcript levels for each gene were normalized to Gapdh. 
Average relative expression ratios for each gene were expressed as a 
percentage of the maximum ratio at peak expression. 

Bioluminescence recording and data analysis. Biolu- 
minescence patterns of NIH 3T3 reporter cells were monitored 
using a LumiCycle luminometer (Actimetrics) as previously 
described [27]. Raw data were plotted. The period of the resulting 
luminescence data was determined through the WaveClock algo- 
rithm as implemented in R [87]. The median value of the period 
corresponding to the "total mode" was used. Amplitude was 
determined by regression to a sinusoidal waveform with the 
established period. To assess significance of period and amplitude 
changes, results for the various Chrono shRNAs were pooled and 
compared to control using the nonparametric Wilcoxon rank sum 
test [wilcox.test() function in R]. Both the reduction in amplitude 
and increase in period were significant at p<0.05. The data were 
also fit to a mixed effects model using the R package "lme4" [88]. 
This model incorporated a fixed effect term for Chrono knockdown 
along with a nested, random effects term for the distinct shRNAs. 
This model explicitly accounts for the added variance resulting 
from the distinct shRNA constructs in a more nuanced fashion and 
also demonstrated a significant (p<0.05) reduction in amplitude 
along with a trend (p = 0.08) for increasing period. 
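
As an illustration of the amplitude step, regression to a sinusoid with a known period reduces to ordinary linear least squares on a cosine and a sine term. The sketch below is not the analysis code used in this study (which was run in R). It assumes the period has already been estimated elsewhere, ignores baseline trends and damping, and uses hypothetical function names and simulated values.

```python
import math


def solve_linear(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_sinusoid(times, values, period):
    """Fit y = mean + a*cos(w*t) + b*sin(w*t) for a fixed period.

    Returns (mean, amplitude), where amplitude = sqrt(a^2 + b^2).
    """
    w = 2.0 * math.pi / period
    rows = [(1.0, math.cos(w * t), math.sin(w * t)) for t in times]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mean, a, b = solve_linear(xtx, xty)
    return mean, math.hypot(a, b)


# Simulated trace: 48 h sampled every 30 min, 24 h period (illustrative only).
demo_times = [0.5 * i for i in range(96)]
demo_values = [10.0 + 3.0 * math.cos(2.0 * math.pi * t / 24.0 - 1.0)
               for t in demo_times]
demo_mean, demo_amplitude = fit_sinusoid(demo_times, demo_values, 24.0)
print(demo_mean, demo_amplitude)
```

Because the period is held fixed, the model is linear in its remaining parameters, so no iterative optimizer or starting guess is needed.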

Reverse Transcription and qPCR 

We used 1 µg total RNA to generate cDNA with the High 
Capacity cDNA Archive Kit using the manufacturer's protocol 
(Applied Biosystems). qPCR reactions were performed using iTaq 
PCR mastermix (BioRad) in combination with gene expression 
assays (Applied Biosystems) on a 7800HT Taqman machine (Applied 
Biosystems). Importin 8 was used as an endogenous control for all 
experiments. 

Recombinant Genetic Techniques 

Mammalian two-hybrid constructs were generated by PCR with 
primers containing the flanking restriction sites that allow for 
in-frame cloning of the full-length ORF (not including the start ATG 
codon) into pACT or pBIND plasmids (Promega). The two-hybrid 
reporter plasmid pGL4P-4XUAS was generated by inserting 4x 
repeats of the Gal4 UAS binding sites into the pGL4P vector 
(Promega). 

Epitope-tagged cDNAs were generated by PCR with primers 
containing the flanking restriction sites that allow for in-frame 
cloning of the full-length ORF (not including the start ATG 
codon) into pFlag [49] or pTag3C plasmids (Stratagene). 

For plasmids expressing S-tagged CHRONO (both wild-type 
and truncation mutants), full-length and truncated DNA fragments 
of the gene were amplified with upstream and downstream 
primers containing S-tag-encoding sequence (KETAAAKFERQHMDS) 
and were subcloned into pCMV Sport6 or pcDNA3.1 expression 
vectors (Invitrogen) using NotI and XhoI restriction enzymes. 

shRNA Construct Sequences 

The shRNA construct sequences were as follows: NS shRNA, 
GAAGAAGATGAAGAGGAGG; Sh234, GAGTGGAGTTGGATGGTAT; 
Sh235, GAGGGAGGATTGGTGTGAT; Sh236, GAGTTGGTTTGCTGACATA; 
Sh237, GGAGAACGTTATGTAGGAA; Sh238, GGAGGGTGGTTGGGAGAGT; 
Sh239, GAAGGTTGGTGGAGGTGGA; and Sh240, GTGTGATGGTTGTGGTGGA. 

Sh238, Sh239, and Sh240 were ultimately found to be ineffective by 
Western and/or PCR. 

Taqman Probe Identifiers 

The Taqman probe identifiers were as follows: For Mus musculus: 
Arntl, Mm00500226_m1; Arntl2, Mm00549497_m1; Per1, 
Mm00501813_m1; Per2, Mm00478113_m1; Per3, Mm00478120_m1; 
Nr1d1, Mm00520708_m1; Chrono (Gm129), Mm01255906_g1; 
Importin 8, Mm01255158_m1. For Homo sapiens: Arntl, Hs00154147_m1; 
Arntl2, Hs00368068_m1; Clock, Hs00231857_m1; Per1, 
Hs00242988_m1; Per2, Hs00256144_m1; Nr1d1, Hs00253876_m1; 
Chrono (C1orf51), Hs00328968_m1; Gapdh, Hs99999905_m1. 

Supporting Information 

Figure S1 (A) Ten-fold cross-validation for machine learning 
approach. Two of the exemplar clock components (2/17, ~10%) 
were removed from the exemplar-training list and evidence factors 
were recomputed based on the reduced list. Genes were reranked 
on the posterior probability of having a core clock function. We 
then recorded the ranking of the known clock components that 
had been excluded from the training set. This procedure was 
repeated after sequentially withholding all 136 possible pairs of 
exemplar components. (Main) The fraction of the test (withheld) 
clock components recovered using a given ranking cutoff (labeled 
sensitivity) is plotted as a function of the ranking cutoff. The 
evidence factor approach is compared to prepackaged implementations 
of a Normal/Gaussian Naive Bayesian classifier and a 
Flexible Naive Bayesian classifier. (Insert) Focused view on 
algorithm performance using cutoff rankings below 400. (B) Venn/Euler 
diagram showing overlap among the top 50 candidate 
clock components as assessed by each of the three different 
machine learning algorithms. (C) Estimated FDR for evidence 
factor approach under different assumptions of core clock network 
size. The number of true core clock components is assumed to be 
25, 50, or 75 genes as shown. The numbers of true and false pos- 
itives were estimated from the number of true clock components, 
test sensitivity, and cutoff number to be screened. A dashed ver- 
tical line corresponding to a screening of the top 50 candidates is 
shown to facilitate comparison. 
(TIFF) 
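
The sensitivity and FDR estimates described in panels (A) and (C) can be illustrated with a short sketch. It reflects one plausible reading of the caption rather than the authors' code: expected true positives are taken as sensitivity times the assumed number of true clock components, and every other screened gene is counted as a false positive. The function names and the ranks below are hypothetical.

```python
def sensitivity_at_cutoff(withheld_ranks, cutoff):
    """Fraction of withheld exemplar genes ranked at or above the cutoff."""
    hits = sum(1 for rank in withheld_ranks if rank <= cutoff)
    return hits / len(withheld_ranks)


def estimated_fdr(sensitivity, n_true, cutoff):
    """Estimated FDR among the top `cutoff` genes for an assumed network size.

    Assumption: expected true positives = sensitivity * n_true (capped at the
    number of genes screened); the remaining screened genes are false positives.
    """
    true_pos = min(sensitivity * n_true, cutoff)
    false_pos = cutoff - true_pos
    return false_pos / cutoff


# Hypothetical ranks of withheld exemplars pooled over cross-validation folds.
ranks = [3, 12, 40, 75, 210, 18, 46, 900]
sens = sensitivity_at_cutoff(ranks, 50)

# Assumed core clock network sizes, matching the three values in panel (C).
fdr_by_size = {n: estimated_fdr(sens, n, 50) for n in (25, 50, 75)}
for n, fdr in fdr_by_size.items():
    print(n, round(fdr, 4))
```

Under this reading, a larger assumed network lowers the estimated FDR at a fixed cutoff, because more of the screened genes are expected to be genuine clock components.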

Figure S2 Initial characterization of candidate genes. 

Mammalian two-hybrid screening and kinetic luminescence 
imaging were used to select high-probability candidate genes for 
more detailed evaluation. (A) The top 25 novel candidate genes 
(not in the exemplar distribution) were screened for physical 
interactions with the listed subset of clock factors. When fused with 
the VP16 activation domain, Cystathionine Beta Synthase (CBS) 
and Interferon-induced Transmembrane Protein 1 (IFITM1) 
demonstrated binding with a >5-fold activation of the Gal4 
UAS reporter over control. GM129/CHRONO was screened in 
the same way and bound BMAL1 and PER2 as shown in Figure 3. 
As compared to NS siRNA control, siRNA-mediated knockdown 
of (B) Cbs and (C) Ifitm1 altered rhythms in synchronized NIH 3T3 
fibroblasts expressing a BMAL:dLUC reporter. Data shown are 
mean ± standard deviation of four replicates. On initial testing, 
other genes among the top 25 candidates demonstrated knock- 
down phenotype in the NIH 3T3 system or evidence of binding, 
but not both. 
(TIFF) 

Figure S3 The effect of core circadian oscillator muta- 
tions on Chrono expression. (A) Time course microarray data 
from Miller et al. [46] including wild-type and Clock mutant 
animals are plotted. Data shown are the average of two biological 
replicates. In both liver and SCN, CLOCK mutation affects Chrono 
expression level and rhythmicity. (B) Time course microarray data 
from Vollmers et al. [48] describing hepatic transcription from 
WT and Cry1/Cry2 double knockout mice under different feeding 
protocols. Data were downloaded from the NIH GEO repository. 
GCRMA-normalized probeset values describing Per1 and Chrono/ 
Gm129 expression are shown. 
(TIFF) 

Figure S4 The influence of CHRONO on Nr1d1 expression 
in cells overexpressing wild-type BMAL1/CLOCK 
or CRY-resistant BMAL1/CLOCK point mutants. The 
indicated plasmids were cotransfected into HEK 293T cells and 
Nr1d1 expression was determined by qPCR. Average activities 
and standard deviations were determined from independent 
biological triplicates. 
(TIFF) 

Figure S5 Confirmation of shRNA efficacy for kinetic 
luminescence experiments. (A) Protein abundance was 
assessed by Western blot analysis using anti-Flag antibody in 
NIH 3T3 cells cotransfected with shRNA and Flag-tagged cDNA. 
(B) The efficiency of shRNA-mediated knockdown on endogenous 
transcript expression was measured by qPCR. 
(TIFF) 

Figure S6 Confirmation of knockout mouse genotype. 

(A) Schematic representation of wild-type (+) or transgenic allele 
(Chrono−) with knockout-first-reporter tagged insertion (KOMP 
repository). The transgenic allele is nonfunctional by virtue of the 
SV40 polyadenylation sequence (pA) inserted in the vector that 
acts like a STOP codon. The small arrows (a and b) show the 
location and direction of PCR genotyping primers. (B) PCR 
genotyping of DNA extracted from mouse toes of WT (Chrono+/+), 
heterozygous (Chrono+/−), and homozygous (Chrono−/−) offspring. 
The arrows (a and b) indicate PCR products corresponding to the 
targeted alleles. A size marker is shown in column M. (C) qPCR 
analysis for Chrono mRNA expression in WT, heterozygous, and 
homozygous Chrono knockout mice with five different qPCR 
primer/probes. 
(TIFF) 

Figure S7 Additional data for Figure 6. (A) Western blot 
showing protein bands for CBP-VN and VC-BMAL1 in the 
absence or presence of S-tagged CHRONO, SPORT6, or 
SPORT6-S. Invariant protein levels suggest that changes in 
complementation signal (Figure 6B and C) result from changes in 
protein complex formation rather than changes in CBP-VN or 
VC-BMAL1 abundance. (B) Only S-tagged CHRONO constructs 
containing the 108-285 region repress CLOCK/BMAL1-mediated 
Per1-Luciferase reporter activity. (C) CHRONO truncation 
mutants were coexpressed along with a BMAL1-GFP construct. 
Cellular localization was visualized via IF analysis using an S-tag 
antibody. Intact CHRONO and truncation mutants containing 
the 108-285 region colocalized with BMAL1-GFP in the nucleus. 
(D) Real-time bioluminescence analysis using the Per2:luc reporter 
cells (U2OS) stably expressing the indicated constructs. Data 
shown are the average of four independent experiments. (E) 
Quantitative analysis of amplitudes of oscillations shown in (D). 
Error bars indicate standard error of the mean. 

Figure S8 The influence of CHRONO on PER2/NR1D1 
complex formation. HEK 293T cells were transfected with 
plasmids encoding PER2-Venus, NR1D1-GFP, and CHRONO-S 
as indicated in the figure. Endogenous protein was immunoprecipitated 
with anti-GFP (which also targets Venus) followed by 
immunoblotting as indicated. PER2/NR1D1 binding appears 
enhanced in the presence of CHRONO. 

(TIFF) 

Table S1 Data matrix showing average fold-activation 
of the 4XUAS:Luciferase reporter (±S.D.) with specified 
Gal4 and VP16 fusion constructs cotransfected into HEK 
293T cells. 

(DOC) 

Table S2 Excel file containing feature metrics for the 
top 1,000 genes as assessed by evidence factor ranks. 

The rankings of these genes obtained using the Gaussian and 
Flexible Naive Bayes classifiers are also reported. 